Communication optimization strategies for distributed deep neural network training: A survey
Authors
Abstract
Recent trends in high-performance computing and deep learning have led to a proliferation of studies on large-scale deep neural network training. However, frequent communication among computation nodes drastically slows the overall training speed, causing bottlenecks in distributed training, particularly in clusters with limited network bandwidths. To mitigate the drawbacks of communication, researchers have proposed various optimization strategies. In this paper, we provide a comprehensive survey of these strategies from both an algorithm viewpoint and a computer network perspective. Algorithm optimizations focus on reducing the communication volume used in distributed training, while network optimizations focus on accelerating the communication between devices. At the algorithm level, we describe how to reduce the number of communication rounds and the number of transmitted bits per round. In addition, we elucidate how to overlap computation and communication. At the network level, we discuss the effects caused by network infrastructures, including logical communication schemes and network protocols. Finally, we extrapolate the potential future challenges and new research directions to accelerate communication for distributed deep neural network training.
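As an illustration of one algorithm-level strategy named in the abstract (reducing the number of transmitted bits per round), the sketch below shows a minimal top-k gradient sparsification step in Python/NumPy. The function name topk_sparsify, the compression ratio, and the tensor shapes are hypothetical choices for this example, not an implementation taken from the surveyed works.

import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Returns the indices and values that would be transmitted, plus the
    dense gradient reconstructed on the receiving side.
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # Indices of the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    values = flat[idx]
    # The receiver rebuilds a sparse gradient from the (index, value) pairs.
    restored = np.zeros_like(flat)
    restored[idx] = values
    return idx, values, restored.reshape(grad.shape)

# Example: a worker compresses its local gradient before communication.
rng = np.random.default_rng(0)
g = rng.normal(size=(1024, 256)).astype(np.float32)
idx, vals, g_hat = topk_sparsify(g, ratio=0.01)
print(f"sent {idx.size} of {g.size} entries "
      f"({idx.size / g.size:.1%} of the original volume)")

In a real data-parallel setting, each worker would transmit only the (index, value) pairs to its peers or to a parameter server; the dropped entries are typically accumulated locally and fed back into the next round's gradient (error feedback) so that the compression does not bias convergence.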
Similar resources
Large Scale Distributed Hessian-Free Optimization for Deep Neural Network
Training a deep neural network is a high-dimensional and highly non-convex optimization problem. In this paper, we revisit the Hessian-free optimization method for deep networks with negative curvature direction detection. We also develop its distributed variant and demonstrate superior scaling potential to SGD, which allows utilizing larger computing resources more efficiently, thus enabling large ...
Exploring Strategies for Training Deep Neural Networks
Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization often appears to get stuck in poor solutions. Hinton et al. recently proposed a greedy layer-wise u...
Regularization and Optimization strategies in Deep Convolutional Neural Network
Convolutional Neural Networks, known as ConvNets, perform exceptionally well in many complex machine learning tasks. The architecture of ConvNets demands a huge and rich amount of data and involves a vast number of parameters, which makes learning computationally expensive, slows convergence towards the global minimum, and can trap the model in local minima with poor predictions. In some cases, a...
Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization
Training neural network acoustic models with sequence-discriminative criteria, such as state-level minimum Bayes risk (sMBR), has been shown to produce large improvements in performance over cross-entropy. However, because they entail the processing of lattices, sequence criteria are much more computationally intensive than cross-entropy. We describe a distributed neural network training algorithm, ...
A conjugate gradient based method for Decision Neural Network training
Decision Neural Network is a new approach for solving multi-objective decision-making problems based on artificial neural networks. Using inaccurate evaluation data, network training has been improved and the number of required training data sets has decreased. The available training method is based on the gradient descent method (BP). One of its limitations is related to its convergence speed. Therefore,...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Journal
Journal title: Journal of Parallel and Distributed Computing
Year: 2021
ISSN: 1096-0848, 0743-7315
DOI: https://doi.org/10.1016/j.jpdc.2020.11.005